<html>
<head>
<meta NAME="description" CONTENT="VMN in Marvin">
<meta NAME="keywords" CONTENT="VMN, Markush, DARC, Java, Marvin">
<meta NAME="author" CONTENT="Nora Mate">
<link REL ="stylesheet" TYPE="text/css" HREF="../marvinmanuals.css" TITLE="Style">
<title>VMN file format in Marvin</title>
</head>
<body>

<h1>Markush DARC format</h1>

<p>
Codename: <strong>vmn</strong>
</p>
<h2>Contents</h2>
<ul>
    <li><a href="#import">VMN import</a>
        <ul>
	<li><a href="#features">Interpretation of VMN features</a></li>
	<li><a href="#abbrevgroups">Structure shortcuts (abbreviated groups)</a></li>
	<li><a href="#peptides">Amino acids (peptides)</a></li>
	<li><a href="#homologies">Superatoms (homology pseudo atoms)</a></li>
	<li><a href="#multipleattachments">Temporary representation of multiple attachment R-groups</a></li>
	<li><a href="#importlimitations">Limitations</a></li>
	</ul>
    </li>
    <li><a href="#export">VMN export</a></li>
    <li><a href="#references">References</a></li>
</ul>

<a name="import"></a><h2>Import from VMN format</h2>
<p>Markush structure files complying the Markush DARC format are processed by Marvin, though with some 
<a href="#importlimitations">limitations</a>. Interpretation of the main VMN features are 
<a href="#features">listed below</a>. Multiple (more than 2) attachment R-groups 
are temporarily represented by attached data, <a href="#multipleattachments">as described below</a>.
This representation will be changed to its final version for the next major release.
Peptide connection bonds may not be processed correctly and 
<a href="#missingpeptides">some peptides are not supported</a> yet. 
The AMN file is looked for in the directory of
the VMN file, with the same name and <code>.amn</code> extension 
(e.g. the AMN file of <code>46mrk001.vmn</code> is <code>46mrk001.amn</code>). 
The AMN attributes and <a href="#missingattributes">some of the atom attributes</a> 
are not processed by search and enumeration but displayed in atom labels.</p>

<a name="features"></a><h3>Interpretation of VMN features</h3>

<p>
<ul>
<li><b>Groups</b>: <code>G0</code> is read in as root structure while <code>G1</code>, <code>G2</code>, ...
are stored in corresponding R-groups <code>R1</code>, <code>R2</code>, ... The representation of
attachments is <a href="#multipleattachments">described below</a>.</li>
<li><b>Undefined attachment information</b> is stored in form of 
<a href="../sketch/sketch-basic.html#positionvariation">position variation (variable attachment) bonds</a>.</li>
<li><b>Moieties</b> are represented as 
<a href="../sketch/sketch-basic.html#repeatingunit">repeating units with repetition ranges</a>.</li>
<li><b>Repeating units</b> other than moieties are described using a numerotation (numbering) attribute of three digits (e.g. <code>100</code>) on the atoms of the repeating unit, with the appropriate repetition range text recorded in the corresponding AMN file (e.g. <code>M100=1-4</code>).</li>
<li><a href="#abbrevgroups"><b>Structure shortcuts</b></a> are read in as specific built-in
<a href="../sketch/sketch-chem.html#sgroups">abbreviated groups (superatom groups)</a>.</li>
<li><a href="#peptides"><b>Amino acids</b></a> are read in by <a href="seq-doc.html">peptide import</a>, 
which uses built-in <a href="../sketch/sketch-chem.html#sgroups">abbreviated groups (superatom groups)</a> 
to represent peptides.</li>
<li><a href="#homologies"><b>Superatoms</b></a> are read in as pseudo-atoms and treated as homology groups.</li>
<li><b>Atom attributes</b>: we interpret the following VMN atom attributes:
    <ul>
    <li><b>AM - Abnormal mass</b> is stored in the mass number of the atom</li>
    <li><b>AV - Abnormal valence</b> is stored in the valence property of the atom</li>
    <li><b>DL - Peptide attribute</b> is interpreted by <a href="seq-doc.html">peptide import</a></li>
    </ul>
</li>
</ul>
</p>

<a name="abbrevgroups"></a><h3>Structure shortcuts (abbreviated groups)</h3>

<p>The following structure shortcuts (abbreviated groups) are supported:</p>
<p>
    <table cellspacing="10">
        <tr>
	    <td>ACE</td><td>BU</td><td>C2, C3, ..., C20</td><td>CN</td><td>CO1</td><td>CO2</td>
        </tr>
        <tr>
	    <td>COI</td><td>ET</td><td>IBU</td><td>IPR</td><td>MBE</td><td>NBU</td>
        </tr>
        <tr>
	    <td>NO2</td><td>NPR</td><td>OBE</td><td>PBE</td><td>PH</td><td>PO3</td>
        </tr>
        <tr>
	    <td>PO4</td><td>SBU</td><td>SO2</td><td>SO3</td><td>TBU</td>
        </tr>
</table> 
</p>

<a name="peptides"></a><h3>Amino acids (peptides)</h3>

<p>The following amino acids (peptide abbreviated groups) are supported:</p>
<p>
    <table cellspacing="10">
	<tr>
	    <td>ALA</td><td>ARG</td><td>ASN</td><td>ASP</td><td>CYS</td><td>GLN</td>
        </tr>
        <tr>
	    <td>GLU</td><td>GLY</td><td>HIS</td><td>ILE</td><td>LEU</td><td>LYS</td>
        </tr>
        <tr>
	    <td>MET</td><td>PHE</td><td>PRO</td><td>SER</td><td>THR</td><td>TRY</td>
        </tr>
        <tr>
	    <td>TYR</td><td>VAL</td>
	</tr>
    </table>
</p>
<p>For more information, refer to the <a href="seq-doc.html">Peptide import</a> documentation.</p>

<a name="homologies"></a><h3>Superatoms (homology pseudo atoms)</h3>

<p>Superatoms representing homology groups are read in as pseudo atoms. The following homologies
are interpreted by enumeration and search:</p>
<p>
    <table cellspacing="10">
	<tr>
	    <td>CHK</td><td>CHE</td><td>CHY</td><td>CYC</td><td>ARY</td><td>HET</td>
        </tr>
        <tr>
	    <td>HEA</td><td>HEF</td><td>UNK</td><td>MX</td><td>AMX</td><td>A35</td>
        </tr>
        <tr>
	    <td>TRM</td><td>LAN</td><td>ACT</td><td>HAL</td><td>ACY</td><td>PRT</td>
	   	</tr>
        <tr>
	    <td>XX</td>
	</tr>
    </table>
</p>
<p>For a detailed description of the interpretation, refer to the 
<a href="../calculations/homologygroups.html#definition">Homology groups in Markush structures</a>
manual.</p>


<a name="multipleattachments"></a><h3>Temporary representation of multiple attachment R-groups</h3>

<p><b>Note, that the current representation is temporary and will be replaced by a final solution
in the next major release.</b></p>

<p>The usual attachment point representation can be used for 1- and 2-attachment R-groups:</p>
<ul>
<li><p>R-atom attachment order is determined by the neighbor atom index order, the second order neighbor 
is marked with a <code>&quot;</code> on the attachment bond:</p>
<p><img src="vmn_1.png" alt="second-order R-atom neighbor" width="168" height="91"></p>
</li>
<li><p>Attachment points in R-group definitions are marked with <code>*</code> and <code>*&quot;</code> signs:</p>
<p><img src="vmn_2.png" alt="attachments in R-group definition" width="282" height="133"></p>
</li>
</ul>

<p>For R-groups with more than 2 attachments we use the following temporary representation:</p>
<ul>
<li><p>R-atom attachment order numbers is stored in <code>ORDER R&lt;n&gt;</code> attached data fields, 
where <code>n</code> is the R-group ID (e.g. <code>ORDER R1</code> for <code>R1</code>, 
<code>ORDER R2</code> for <code>R2</code>, etc.):</p>
<p><img src="vmn_3.png" alt="R-atom attachment order" width="266" height="153"></p>
</li>
<li><p>Attachment point numbers in R-group definitions are stored in the <code>ATTACH</code> attached 
data field:</p>
<p><img src="vmn_4.png" alt="attachment points in R-group definition" width="383" height="153"></p>
</li>
</ul>
<p>Attached data can also be set manually in MarvinSketch, as described in the
<a href="../sketch/sketch-chem.html#attacheddata">MarvinSketch Chemical Features manual</a>.</p>
<p>Both the attachment order numbers and the attachment point numbers are consecutive positive integers 
<code>1, 2, 3, ...</code>, where attachment <code>i</code> is to be connected to the R-atom neighbor 
with order number <code>i</code>, for each <code>i</code>. Multiple attachment numbers on the same atom
are separated by commas.</p>
<p>Attachment point numbers can be omitted for single atom R-group definition members. If the order numbers
are missing for an R-atom, then the atom index order of the neighboring atoms is used to determine the 
order numbers.</p>


<a name="importlimitations"></a><h3>Limitations</h3>

<a name="missingpeptides"></a><h4>Peptides that are not supported</h4>

<p>The following peptides are not supported by <a href="seq-doc.html">Peptide import</a> 
(and therefore are not read in from VMN files):</p>
<p>
    <table cellspacing="5">
	<tr><td>ABU</td><td>aminobutyric acid</td></tr>
	<tr><td>ASU</td><td>aminosuberic acid</td></tr>
	<tr><td>GLP</td><td>pyroglumatic acid</td></tr>
	<tr><td>HCY</td><td>homocysteine</td></tr>
	<tr><td>HSE</td><td>homoserine</td></tr>
	<tr><td>NLE</td><td>norleucine</td></tr>
	<tr><td>NVA</td><td>norvaline</td></tr>
	<tr><td>ORN</td><td>ornithine</td></tr>
	<tr><td>SAR</td><td>sarcosine</td></tr>
	<tr><td>STA</td><td>statine</td></tr>
    </table>
</p>
<p>Note, that peptide connection bonds are not handled currently, 
therefore peptide sequences may not be correct.</p>

<a name="missinghomologies"></a><h4>Superatoms (homologies) that are not supported</h4>

<p>The following superatoms (homologies) are not supported by search and enumeration
(but read in and displayed as pseudo-atoms):</p>

<p>
    <table cellspacing="10">
	<tr>
	    <td>POL</td><td>PEG</td><td>XX</td><td>DYE</td><td>PRT</td>
        </tr> 
    </table>
</p>

<p>The detailed description of homology interpretation is described in the 
<a href="../calculations/homologygroups.html#definition">Homology groups in Markush structures</a>
manual.</p>


<a name="missingattributes"></a><h4>Atom attributes that are not processed by search and enumeration</h4>

<p>The following atom attributes are not processed by search and enumeration
(but displayed in atom labels):</p>
<p>
    <table cellspacing="5">
	<tr><td>CR</td><td>Carbon chain/ring attribute</td></tr>
	<tr><td>DT</td><td>Deuterium and tritium count</td></tr>
	<tr><td>MU</td><td>Multiplier attribute</td></tr>
	<tr><td>NU</td><td>Numerotation attribute (link to AMN data)</td></tr>
	<tr><td>PA</td><td>Polymer indicator</td></tr>
	<tr><td>SP</td><td>Position indicator</td></tr>
    </table>
</p>



<a name="export"></a><h2>Export to VMN format</h2>
<p>VMN export is not available yet.</p> 




<h2><a class="anchor" name="references">References</a></h2>
<ol>
<li>The Markush DARC Format, T Ferns, Internal Technical Report, Thomson Scientific, 2002</li>
<li><a href="http://science.thomsonreuters.com/m/pdfs/mgr/Markush_Darc_User_Manual.pdf">Derwent World Patents Index, Markush DARC User Manual, The Thomson Corporation, 1993, 2008</a></li>
</ol>

</body>
</html>
